Press the button 'Toggle code' below to toggle code on and off for this entire presentation.
from IPython.display import display
from IPython.display import HTML
import IPython.core.display as di # Example: di.display_html('<h3>%s:</h3>' % str, raw=True)
# This line will hide code by default when the notebook is exported as HTML
di.display_html('<script>jQuery(function() {if (jQuery("body.notebook_app").length == 0) { jQuery(".input_area").toggle(); jQuery(".prompt").toggle();}});</script>', raw=True)
# This line will add a button to toggle visibility of code blocks, for use with the HTML export version
di.display_html('''<button onclick="jQuery('.input_area').toggle(); jQuery('.prompt').toggle();">Toggle code</button>''', raw=True)
Autograd is an automatic derivative calculator built to differentiate general numpy code, and mathematical functions defined by numpy code in particular.
First we can define any math function we like - for example
\begin{equation} g(w) = \text{tanh}(w) \end{equation}We express this function using numpy - or more specifically a thinly wrapped version of numpy corresponding to the autograd differentiator.
# import thinly wrapped numpy
import autograd.numpy as np
# define a math function
g = lambda w: np.tanh(w)
# import autograd Automatic Differentiator to compute the derivatives
from autograd import grad
# compute the derivative of our input function
dgdw = grad(g)
This derivative function is something we can call just as we can the original function g.
# define set of points over which to plot function and derivative
w = np.linspace(-3,3,2000)
# evaluate the input function g and derivative dgdw over the input points
gvals = [g(v) for v in w]
dgvals = [dgdw(v) for v in w]
# import matplotlib for plotting
import matplotlib.pyplot as plt

# plot the function and derivative
fig = plt.figure(figsize = (7,3))
plt.plot(w,gvals,linewidth=2)
plt.plot(w,dgvals,linewidth=2)
plt.legend(['$g(w)$',r'$\frac{\mathrm{d}}{\mathrm{d}w}g(w)$'],loc='center left', bbox_to_anchor=(0, 0.5),fontsize = 13)
plt.show()
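Before relying on autograd's output it can be reassuring to check it against a hand-derived result. Since $\frac{\mathrm{d}}{\mathrm{d}w}\text{tanh}(w) = 1 - \text{tanh}^2(w)$, the sketch below compares this analytic derivative against a central finite difference; it deliberately uses only the standard library's `math` module in place of autograd, so it runs even where autograd is not installed.

```python
import math

# analytic derivative of tanh: d/dw tanh(w) = 1 - tanh(w)^2
def tanh_deriv(w):
    return 1.0 - math.tanh(w)**2

# central finite-difference approximation of the first derivative
def finite_diff(f, w, h=1e-5):
    return (f(w + h) - f(w - h)) / (2.0 * h)

# the two should agree to high precision at any test point
for w0 in [-2.0, -0.5, 0.0, 1.3]:
    assert abs(finite_diff(math.tanh, w0) - tanh_deriv(w0)) < 1e-8
```

An `autograd`-computed `dgdw(w0)` should agree with both quantities to similar precision.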
We can compute further derivatives of this input function by using the same autograd function, only this time plugging in the derivative dgdw. Doing this once gives us the second derivative.
# compute the second derivative of our input function
dgdw2 = grad(dgdw)
We can then plot this along with the first derivative and original function.
# define set of points over which to plot function and first two derivatives
w = np.linspace(-3,3,2000)
# evaluate the input function g, first derivative dgdw, and second derivative dgdw2 over the input points
gvals = [g(v) for v in w]
dgvals = [dgdw(v) for v in w]
dg2vals = [dgdw2(v) for v in w]
# plot the function and derivative
fig = plt.figure(figsize = (7,3))
plt.plot(w,gvals,linewidth=2)
plt.plot(w,dgvals,linewidth=2)
plt.plot(w,dg2vals,linewidth=2)
plt.legend(['$g(w)$',r'$\frac{\mathrm{d}}{\mathrm{d}w}g(w)$',r'$\frac{\mathrm{d}^2}{\mathrm{d}w^2}g(w)$'],loc='center left', bbox_to_anchor=(0, 0.5),fontsize = 13)
plt.show()
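The second derivative can be checked the same way. For $g(w) = \text{tanh}(w)$ the analytic second derivative is $-2\,\text{tanh}(w)\left(1 - \text{tanh}^2(w)\right)$; the sketch below verifies this with a second-order central difference, again standing in for autograd with only the standard library.

```python
import math

# analytic second derivative of tanh: -2*tanh(w)*(1 - tanh(w)^2)
def tanh_second_deriv(w):
    t = math.tanh(w)
    return -2.0 * t * (1.0 - t**2)

# second-order central-difference approximation of the second derivative
def finite_diff2(f, w, h=1e-4):
    return (f(w + h) - 2.0*f(w) + f(w - h)) / h**2

# the approximation should match the analytic formula closely
for w0 in [-1.5, 0.0, 0.7]:
    assert abs(finite_diff2(math.tanh, w0) - tanh_second_deriv(w0)) < 1e-5
```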
For a function $g(w)$ we formally describe the tangent line at a point $w^0$ as
\begin{equation} h(w) = g(w^0) + \frac{\mathrm{d}}{\mathrm{d}w}g(w^0)(w - w^0) \end{equation}with the slope here given by the derivative $\frac{\mathrm{d}}{\mathrm{d}w}g(w^0)$.
# create area over which to evaluate everything
w = np.linspace(-3,3,2000); w_0 = 1.0; w_=np.linspace(-2+w_0,2+w_0,2000);
# define and evaluate the function, define derivative
g = lambda w: np.sin(w); dgdw = grad(g);
gvals = [g(v) for v in w]
# create tangent line at a point w_0
tangent = g(w_0) + dgdw(w_0)*(w_ - w_0)
# plot the function and derivative
fig = plt.figure(figsize = (4,3))
plt.plot(w,gvals,c = 'k',linewidth=2,zorder = 1)
plt.plot(w_,tangent,c = [0,1,0.25],linewidth=2,zorder = 2)
plt.scatter(w_0,g(w_0),c = 'r',s=50,zorder = 3,edgecolor='k',linewidth=1)
plt.legend(['$g(w)$','tangent'],loc='center left', bbox_to_anchor=(0, 0.8),fontsize = 13)
plt.show()
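Using the same $g(w) = \sin(w)$ and $w^0 = 1$ as in the code above, a short numeric check confirms that the tangent line agrees with $g$ in both value and slope at $w^0$; a central finite difference stands in for autograd here.

```python
import math

w0 = 1.0
g = math.sin
slope = math.cos(w0)   # the derivative of sin at w0 is cos(w0)

# tangent line h(w) = g(w0) + g'(w0)*(w - w0)
def h(w):
    return g(w0) + slope * (w - w0)

# matching function value at w0
assert abs(h(w0) - g(w0)) < 1e-12

# matching derivative at w0, checked by central finite difference
eps = 1e-5
dh = (h(w0 + eps) - h(w0 - eps)) / (2*eps)
dg = (g(w0 + eps) - g(w0 - eps)) / (2*eps)
assert abs(dh - dg) < 1e-8
```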
In short, the tangent line $h$ matches $g$ at $w^0$ in that both the function value and the derivative value are equal there.
\begin{array} \ 1. \,\,\, h(w^0) = g(w^0) \\ 2. \,\,\, \frac{\mathrm{d}}{\mathrm{d}w}h(w^0) = \frac{\mathrm{d}}{\mathrm{d}w}g(w^0) \\ \end{array}Likewise we can determine a simple function $h$ that matches $g$ at its second derivative value as well
\begin{array} \ 1. \,\,\, h(w^0) = g(w^0) \\ 2. \,\,\, \frac{\mathrm{d}}{\mathrm{d}w}h(w^0) = \frac{\mathrm{d}}{\mathrm{d}w}g(w^0) \\ 3. \,\,\, \frac{\mathrm{d}^2}{\mathrm{d}w^2}h(w^0) = \frac{\mathrm{d}^2}{\mathrm{d}w^2}g(w^0) \\ \end{array}This can be shown to be (see the associated post for complete details)
\begin{equation} h(w) = g(w^0) + \frac{\mathrm{d}}{\mathrm{d}w}g(w^0)(w - w^0) + \frac{1}{2}\frac{\mathrm{d}^2}{\mathrm{d}w^2}g(w^0)(w - w^0)^2 \end{equation}This is one step beyond the tangent line - a tangent quadratic function - note that the first two terms are indeed the tangent line itself.
# create area over which to evaluate everything
w = np.linspace(-3,3,2000); w_0 = 1.0; w_=np.linspace(-2+w_0,2+w_0,2000);
# define and evaluate the function, define derivative
g = lambda w: np.sin(w); dgdw = grad(g); dgdw2 = grad(dgdw);
gvals = [g(v) for v in w]
# create tangent line and quadratic
tangent = g(w_0) + dgdw(w_0)*(w_ - w_0)
quadratic = g(w_0) + dgdw(w_0)*(w_ - w_0) + 0.5*dgdw2(w_0)*(w_ - w_0)**2
# plot the function and derivative
fig = plt.figure(figsize = (4,3))
plt.plot(w,gvals,c = 'k',linewidth=2,zorder = 1)
plt.plot(w_,tangent,c = [0,1,0.25],linewidth=2,zorder = 2)
plt.plot(w_,quadratic,c = [0,0.75,1],linewidth=2,zorder = 2)
plt.scatter(w_0,g(w_0),c = 'r',s=50,zorder = 3,edgecolor='k',linewidth=1)
plt.legend(['$g(w)$','tangent line','tangent quadratic'],loc='center left', bbox_to_anchor=(-0.2, 0.8),fontsize = 12)
plt.show()
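A similar check confirms that the tangent quadratic matches $g(w) = \sin(w)$ at $w^0 = 1$ in value, first derivative, and second derivative; finite differences again stand in for autograd in this sketch.

```python
import math

w0 = 1.0
s, c = math.sin(w0), math.cos(w0)

# tangent quadratic for sin: derivatives at w0 are cos(w0) and -sin(w0)
def quad(w):
    return s + c*(w - w0) - 0.5*s*(w - w0)**2

# second-order central difference for the second derivative
def d2(f, w, h=1e-4):
    return (f(w + h) - 2*f(w) + f(w - h)) / h**2

# criterion 1: matching value at w0
assert abs(quad(w0) - math.sin(w0)) < 1e-12
# criterion 2: matching first derivative at w0
eps = 1e-5
assert abs((quad(w0 + eps) - quad(w0 - eps)) / (2*eps) - math.cos(w0)) < 1e-8
# criterion 3: matching second derivative at w0
assert abs(d2(quad, w0) - (-math.sin(w0))) < 1e-5
```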
Adding a fourth matching criterion, that $h$ also match the third derivative of $g$ at $w^0$, leads to the following degree 3 polynomial
\begin{equation} h(w) = g(w^0) + \frac{\mathrm{d}}{\mathrm{d}w}g(w^0)(w - w^0) + \frac{1}{2}\frac{\mathrm{d}^2}{\mathrm{d}w^2}g(w^0)(w - w^0)^2 + \frac{1}{3\times2}\frac{\mathrm{d}^3}{\mathrm{d}w^3}g(w^0)(w - w^0)^3 \end{equation}More generally, setting up the corresponding set of $N+1$ criteria leads to the construction of a degree $N$ polynomial
\begin{equation} h(w) = g(w^0) + \sum_{n=1}^{N} \frac{1}{n!}\frac{\mathrm{d}^n}{\mathrm{d}w^n}g(w^0)(w - w^0)^n \end{equation}This general degree $N$ polynomial is called the Taylor series approximation of $g$ at the point $w^0$.
It is the degree $N$ polynomial that matches $g$ as well as its first $N$ derivatives at the point $w^0$, and therefore approximates $g$ near this point better and better as we increase $N$.
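To see this improvement concretely, the sketch below builds the degree $N$ Taylor polynomial of $\sin$ about $w^0$ by hand, using the fact that the derivatives of $\sin$ cycle through $\sin$, $\cos$, $-\sin$, $-\cos$, and checks that the approximation error at a nearby point shrinks as $N$ grows.

```python
import math

def sin_taylor(w, w0, N):
    """Degree-N Taylor polynomial of sin about w0.

    The n-th derivative of sin cycles through sin, cos, -sin, -cos."""
    derivs = [math.sin(w0), math.cos(w0), -math.sin(w0), -math.cos(w0)]
    total = 0.0
    for n in range(N + 1):
        total += derivs[n % 4] / math.factorial(n) * (w - w0)**n
    return total

# approximation error at a nearby point shrinks as the degree grows
w0, w = 1.0, 1.5
errors = [abs(sin_taylor(w, w0, N) - math.sin(w)) for N in range(1, 6)]
assert all(e2 < e1 for e1, e2 in zip(errors, errors[1:]))
```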
We illustrate the first four Taylor series polynomials for a user-defined input function below, animated over a range of points in the function's input domain.
You can use the slider to shift the point at which each approximation is made back and forth across the input range.
# what function should we play with? Defined in the next line.
g = lambda w: np.sin(2*w)
# create an instance of the visualizer with this function
taylor_viz = calclib.taylor_series_simultaneous_approximations.visualizer(g = g)
# run the visualizer for our chosen input function
taylor_viz.draw_it(num_frames = 200)